<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <link rel="stylesheet" href="../../aosa.css" type="text/css">
    <title>500 Lines or Less: A Template Engine</title>
  </head>
  <body>

    <div class="titlebox">
      <h1>500 Lines or Less<br>A Template Engine</h1>
      <p class="author">Ned Batchelder</p>
    </div>

    <p><em>Ned Batchelder is a software engineer with a long career, currently working at edX to build open source software to educate the world. He's the maintainer of coverage.py, an organizer of Boston Python, and has spoken at many PyCons. He blogs at <a href="http://nedbatchelder.com">http://nedbatchelder.com</a>. He once had dinner at the White House.</em></p>

<h2 id="introduction">Introduction</h2>

<p>Most programs contain a lot of logic, and a little bit of literal textual data. Programming languages are designed to be good for this sort of programming. But some programming tasks involve only a little bit of logic, and a great deal of textual data. For these tasks, we'd like to have a tool better suited to these text-heavy problems. A template engine is such a tool. In this chapter, we build a simple template engine.</p>

<p>The most common example of one of these text-heavy tasks is in web applications. An important phase in any web application is generating HTML to be served to the browser. Very few HTML pages are completely static: they involve at least a small amount of dynamic data, such as the user's name. Usually, they contain a great deal of dynamic data: product listings, friends' news updates, and so on.</p>

<p>At the same time, every HTML page contains large swaths of static text. And these pages are large, containing tens of thousands of bytes of text. The web application developer has a problem to solve: how best to generate a large string containing a mix of static and dynamic data? To add to the problem, the static text is actually HTML markup that is authored by another member of the team, the front-end designer, who wants to be able to work with it in familiar ways.</p>

<p>For purposes of illustration, let's imagine we want to produce this toy HTML:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Welcome, Charlie!<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;p&gt;</span>Products:<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;ul&gt;</span>
    <span class="kw">&lt;li&gt;</span>Apple: $1.00<span class="kw">&lt;/li&gt;</span>
    <span class="kw">&lt;li&gt;</span>Fig: $1.50<span class="kw">&lt;/li&gt;</span>
    <span class="kw">&lt;li&gt;</span>Pomegranate: $3.25<span class="kw">&lt;/li&gt;</span>
<span class="kw">&lt;/ul&gt;</span></code></pre>

<p>Here, the user's name will be dynamic, as will the names and prices of the products. Even the number of products isn't fixed: at another moment, there could be more or fewer products to display.</p>

<p>One way to make this HTML would be to have string constants in our code, and join them together to produce the page. Dynamic data would be inserted with string substitution of some sort. Some of our dynamic data is repetitive, like our lists of products. This means we'll have chunks of HTML that repeat, so those will have to be handled separately and combined with the rest of the page.</p>

<p>Producing our toy page in this way might look like this:</p>

<pre class="sourceCode python"><code class="sourceCode python"><span class="co"># The main HTML for the whole page.</span>
PAGE_HTML = <span class="st">&quot;&quot;&quot;</span>
<span class="st">&lt;p&gt;Welcome, </span><span class="ot">{name}</span><span class="st">!&lt;/p&gt;</span>
<span class="st">&lt;p&gt;Products:&lt;/p&gt;</span>
<span class="st">&lt;ul&gt;</span>
<span class="ot">{products}</span>
<span class="st">&lt;/ul&gt;</span>
<span class="st">&quot;&quot;&quot;</span>

<span class="co"># The HTML for each product displayed.</span>
PRODUCT_HTML = <span class="st">&quot;&lt;li&gt;</span><span class="ot">{prodname}</span><span class="st">: </span><span class="ot">{price}</span><span class="st">&lt;/li&gt;</span><span class="ch">\n</span><span class="st">&quot;</span>

<span class="kw">def</span> make_page(username, products):
    product_html = <span class="st">&quot;&quot;</span>
    <span class="kw">for</span> prodname, price in products:
        product_html += PRODUCT_HTML.<span class="dt">format</span>(
            prodname=prodname, price=format_price(price))
    html = PAGE_HTML.<span class="dt">format</span>(name=username, products=product_html)
    <span class="kw">return</span> html</code></pre>

<p>This works, but we have a mess on our hands. The HTML is in multiple string constants embedded in our application code. The logic of the page is hard to see because the static text is broken into separate pieces. The details of how data is formatted are lost in the Python code. To modify the HTML page, our front-end designer would need to edit Python code. Imagine what the code would look like if the page were ten (or one hundred) times more complicated; it would quickly become unworkable.</p>

<h2 id="templates">Templates</h2>

<p>The better way to produce HTML pages is with <em>templates</em>. The HTML page is authored as a template, meaning that the file is mostly static HTML, with dynamic pieces embedded in it using special notation. Our toy page above could look like this as a template:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Welcome, {{user_name}}!<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;p&gt;</span>Products:<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;ul&gt;</span>
{% for product in product_list %}
    <span class="kw">&lt;li&gt;</span>{{ product.name }}:
        {{ product.price|format_price }}<span class="kw">&lt;/li&gt;</span>
{% endfor %}
<span class="kw">&lt;/ul&gt;</span></code></pre>

<p>Here the focus is on the HTML text, with logic embedded in the HTML. Contrast this document-centric approach with our logic-centric code above. Our earlier program was mostly Python code, with HTML embedded in the Python logic. Here our program is mostly static HTML markup.</p>

<p>The mostly-static style used in templates is the opposite of how most programming languages work. For example, with Python, most of the source file is executable code, and if you need literal static text, you embed it in a string literal:</p>

<pre class="sourceCode python"><code class="sourceCode python"><span class="kw">def</span> hello():
    <span class="dt">print</span>(<span class="st">&quot;Hello, world!&quot;</span>)

hello()</code></pre>

<p>When Python reads this source file, it interprets text like <code>def hello():</code> as instructions to be executed. The double quote character in <code>print(&quot;Hello, world!&quot;)</code> indicates that the following text is meant literally, until the closing double quote. This is how most programming languages work: mostly dynamic, with some static pieces embedded in the instructions. The static pieces are indicated by the double-quote notation.</p>

<p>A template language flips this around: the template file is mostly static literal text, with special notation to indicate the executable dynamic parts.</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Welcome, {{user_name}}!<span class="kw">&lt;/p&gt;</span></code></pre>

<p>Here the text is meant to appear literally in the resulting HTML page, until the '<code>{{</code>' indicates a switch into dynamic mode, where the <code>user_name</code> variable will be substituted into the output.</p>

<p>String formatting functions such as Python's <code>&quot;foo = {foo}!&quot;.format(foo=17)</code> are examples of mini-languages used to create text from a string literal and the data to be inserted. Templates extend this idea to include constructs like conditionals and loops, but the difference is only of degree.</p>

<p>These files are called templates because they are used to produce many pages with similar structure but differing details.</p>

<p>To use HTML templates in our programs, we need a <em>template engine</em>: a function that takes a static template describing the structure and static content of the page, and a dynamic <em>context</em> that provides the dynamic data to plug into the template. The template engine combines the template and the context to produce a complete string of HTML. The job of a template engine is to interpret the template, replacing the dynamic pieces with real data.</p>
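<p>To make that interface concrete, here is a minimal sketch of the idea. It is not the engine built in this chapter: the <code>render</code> function and its regular expression are hypothetical, and it supports only plain <code>{{name}}</code> substitution, with no dots, filters, conditionals, or loops.</p>

<pre class="sourceCode python"><code class="sourceCode python">import re

def render(template, context):
    """Replace each {{name}} in template with str(context[name])."""
    def substitute(match):
        # group(1) is the variable name between the braces.
        return str(context[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

page = render("Welcome, {{user_name}}!", {"user_name": "Charlie"})
print(page)     # prints: Welcome, Charlie!</code></pre>

<p>The same template can be rendered again with a different context to produce a different page, which is the whole point of separating the two.</p>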

<p>By the way, there's often nothing particular about HTML in a template engine: it could be used to produce any textual result. For example, template engines are also used to produce plain-text email messages. But usually they are used for HTML, and occasionally have HTML-specific features, such as escaping, which makes it possible to insert values into the HTML without worrying about which characters are special in HTML.</p>

<h2 id="supported-syntax">Supported Syntax</h2>

<p>Template engines vary in the syntax they support. Our template syntax is based on Django, a popular web framework. Since we are implementing our engine in Python, some Python concepts will appear in our syntax. We've already seen some of this syntax in our toy example at the top of the chapter, but this is a quick summary of all of the syntax we'll implement.</p>

<p>Data from the context is inserted using double curly braces:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Welcome, {{user_name}}!<span class="kw">&lt;/p&gt;</span></code></pre>

<p>The data available to the template is provided in the context when the template is rendered. More on that later.</p>

<p>Template engines usually provide access to elements within data using a simplified and relaxed syntax. In Python, these expressions all have different effects:</p>

<pre class="sourceCode python"><code class="sourceCode python"><span class="dt">dict</span>[<span class="st">&quot;key&quot;</span>]
obj.attr
obj.method()</code></pre>

<p>In our template syntax, all of these operations are expressed with a dot:</p>

<pre><code>dict.key
obj.attr
obj.method</code></pre>

<p>The dot will access object attributes or dictionary values, and if the resulting value is callable, it's automatically called. This is different from Python code, where each of those operations needs its own syntax. This results in simpler template syntax:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>The price is: {{product.price}}, with a {{product.discount}}% discount.<span class="kw">&lt;/p&gt;</span></code></pre>
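<p>One way such relaxed dot access could be implemented is sketched below. The helper name <code>resolve_dots</code> and the <code>Product</code> class are assumptions made for illustration: each name is tried first as an attribute, then as a key, and a callable result is called with no arguments.</p>

<pre class="sourceCode python"><code class="sourceCode python">def resolve_dots(value, *names):
    """Follow each name as an attribute or a key, calling callables."""
    for name in names:
        try:
            value = getattr(value, name)    # obj.attr or obj.method
        except AttributeError:
            value = value[name]             # dict["key"]
        if callable(value):
            value = value()                 # methods are called for us
    return value

class Product(object):
    """A hypothetical data object, used only for this example."""
    def __init__(self, name, price):
        self.name = name
        self.price = price

    def discount(self):
        return 10

context = {"product": Product("Fig", 1.50)}
print(resolve_dots(context, "product", "price"))      # prints: 1.5
print(resolve_dots(context, "product", "discount"))   # prints: 10</code></pre>

<p>Note that in this sketch an attribute wins over a key of the same name, so a dictionary key such as <code>items</code> would be shadowed by the dictionary's own method. That is one price of the relaxed syntax.</p>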

<p>You can use functions called <em>filters</em> to modify values. Filters are invoked with a pipe character:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Short name: {{story.subject|slugify|lower}}<span class="kw">&lt;/p&gt;</span></code></pre>
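<p>Conceptually, a chain of filters is just function application from left to right. The sketch below assumes that filters are one-argument functions looked up by name in a dictionary; the <code>apply_filters</code> helper and the simplified <code>slugify</code> are hypothetical stand-ins.</p>

<pre class="sourceCode python"><code class="sourceCode python">def apply_filters(value, filter_names, filters):
    """Pass value through each named filter, left to right."""
    for name in filter_names:
        value = filters[name](value)
    return value

def slugify(text):
    # A simplified stand-in: replace spaces with hyphens.
    return text.replace(" ", "-")

filters = {"slugify": slugify, "lower": str.lower}

# Equivalent in spirit to {{story.subject|slugify|lower}}.
short_name = apply_filters("Template Engines", ["slugify", "lower"], filters)
print(short_name)   # prints: template-engines</code></pre>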

<p>Building interesting pages usually requires at least a small amount of decision-making, so conditionals are available:</p>

<pre class="sourceCode html"><code class="sourceCode html">{% if user.is_logged_in %}
    <span class="kw">&lt;p&gt;</span>Welcome, {{ user.name }}!<span class="kw">&lt;/p&gt;</span>
{% endif %}</code></pre>

<p>Looping lets us include collections of data in our pages:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Products:<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;ul&gt;</span>
{% for product in product_list %}
    <span class="kw">&lt;li&gt;</span>{{ product.name }}: {{ product.price|format_price }}<span class="kw">&lt;/li&gt;</span>
{% endfor %}
<span class="kw">&lt;/ul&gt;</span></code></pre>

<p>As with other programming languages, conditionals and loops can be nested to build complex logical structures.</p>

<p>Lastly, so that we can document our templates, comments appear between brace-hashes:</p>

<pre class="sourceCode html"><code class="sourceCode html">{# This is the best template ever! #}</code></pre>

<h2 id="implementation-approaches">Implementation Approaches</h2>

<p>In broad strokes, the template engine will have two main phases: <em>parsing</em> the template, and then <em>rendering</em> the template.</p>

<p>Rendering the template specifically involves:</p>

<ul>
<li>Managing the dynamic context, the source of the data</li>
<li>Executing the logic elements</li>
<li>Implementing dot access and filter execution</li>
</ul>

<p>The question of what to pass from the parsing phase to the rendering phase is key. What does parsing produce that can be rendered? There are two main options; we'll call them <em>interpretation</em> and <em>compilation</em>, using the terms loosely from other language implementations.</p>

<p>In an interpretation model, parsing produces a data structure representing the structure of the template. The rendering phase walks that data structure, assembling the result text based on the instructions it finds. For a real-world example, the Django template engine uses this approach.</p>
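<p>As a rough illustration of the interpretation model, the sketch below walks a small hand-built data structure. The node format (tuples tagged <code>"text"</code>, <code>"var"</code>, and <code>"for"</code>) is invented for this example and is not how Django represents templates.</p>

<pre class="sourceCode python"><code class="sourceCode python">def interpret(nodes, context):
    """Walk a parsed template and assemble the result text."""
    result = []
    for kind, payload in nodes:
        if kind == "text":
            result.append(payload)
        elif kind == "var":
            result.append(str(context[payload]))
        elif kind == "for":
            loop_var, list_name, body = payload
            for item in context[list_name]:
                inner = dict(context)       # copy, then bind the loop variable
                inner[loop_var] = item
                result.append(interpret(body, inner))
    return "".join(result)

# What a parser might produce for:
#   Topics:{% for topic in topics %} {{topic}}{% endfor %}
parsed = [
    ("text", "Topics:"),
    ("for", ("topic", "topics", [("text", " "), ("var", "topic")])),
]
print(interpret(parsed, {"topics": ["Python", "Juggling"]}))
# prints: Topics: Python Juggling</code></pre>

<p>Every render repeats the same walk and the same tests on the node kind, which is the overhead a compiled template avoids.</p>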

<p>In a compilation model, parsing produces some form of directly executable code. The rendering phase executes that code, producing the result. Jinja2 and Mako are two examples of template engines that use the compilation approach.</p>

<p>Our implementation of the engine uses compilation: we compile the template into Python code. When run, the Python code assembles the result.</p>

<p>The template engine described here was originally written as part of coverage.py, to produce HTML reports. In coverage.py, there are only a few templates, and they are used over and over to produce many files from the same template. Overall, the program ran faster if the templates were compiled to Python code. Even though the compilation process was a bit more complicated, it only had to run once, while the compiled code ran many times, and running it was faster than interpreting a data structure many times.</p>

<p>It's a bit more complicated to compile the template to Python, but it's not as bad as you might think. And besides, as any developer can tell you, it's more fun to write a program to write a program than it is to write a program!</p>

<p>Our template compiler is a small example of a general technique called code generation. Code generation underlies many powerful and flexible tools, including programming language compilers. Code generation can get complex, but is a useful technique to have in your toolbox.</p>

<p>Another application of templates might prefer the interpreted approach, if templates will be used only a few times each. Then the effort to compile to Python won't pay off in the long run, and a simpler interpretation process might perform better overall.</p>

<h2 id="compiling-to-python">Compiling to Python</h2>

<p>Before we get to the code of the template engine, let's look at the code it produces. The parsing phase will convert a template into a Python function. Here is our small template again:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Welcome, {{user_name}}!<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;p&gt;</span>Products:<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;ul&gt;</span>
{% for product in product_list %}
    <span class="kw">&lt;li&gt;</span>{{ product.name }}:
        {{ product.price|format_price }}<span class="kw">&lt;/li&gt;</span>
{% endfor %}
<span class="kw">&lt;/ul&gt;</span></code></pre>

<p>Our engine will compile this template to Python code. The resulting Python code looks unusual, because we've chosen some shortcuts that produce slightly faster code. Here is the Python (slightly reformatted for readability):</p>

<pre class="sourceCode python"><code class="sourceCode python"><span class="kw">def</span> render_function(context, do_dots):
    c_user_name = context[<span class="st">&#39;user_name&#39;</span>]
    c_product_list = context[<span class="st">&#39;product_list&#39;</span>]
    c_format_price = context[<span class="st">&#39;format_price&#39;</span>]

    result = []
    append_result = result.append
    extend_result = result.extend
    to_str = <span class="dt">str</span>

    extend_result([
        <span class="st">&#39;&lt;p&gt;Welcome, &#39;</span>,
        to_str(c_user_name),
        <span class="st">&#39;!&lt;/p&gt;</span><span class="ch">\n</span><span class="st">&lt;p&gt;Products:&lt;/p&gt;</span><span class="ch">\n</span><span class="st">&lt;ul&gt;</span><span class="ch">\n</span><span class="st">&#39;</span>
    ])
    <span class="kw">for</span> c_product in c_product_list:
        extend_result([
            <span class="st">&#39;</span><span class="ch">\n</span><span class="st">    &lt;li&gt;&#39;</span>,
            to_str(do_dots(c_product, <span class="st">&#39;name&#39;</span>)),
            <span class="st">&#39;:</span><span class="ch">\n</span><span class="st">        &#39;</span>,
            to_str(c_format_price(do_dots(c_product, <span class="st">&#39;price&#39;</span>))),
            <span class="st">&#39;&lt;/li&gt;</span><span class="ch">\n</span><span class="st">&#39;</span>
        ])
    append_result(<span class="st">&#39;</span><span class="ch">\n</span><span class="st">&lt;/ul&gt;</span><span class="ch">\n</span><span class="st">&#39;</span>)
    <span class="kw">return</span> <span class="st">&#39;&#39;</span>.join(result)</code></pre>

<p>Each template is converted into a <code>render_function</code> function that takes a dictionary of data called the context. The body of the function starts by unpacking the data from the context into local names, because they are faster for repeated use. All the context data goes into locals with a <code>c_</code> prefix so that we can use other local names without fear of collisions.</p>

<p>The result of the template will be a string. The fastest way to build a string from parts is to create a list of strings, and join them together at the end. <code>result</code> will be the list of strings. Because we're going to add strings to this list, we capture its <code>append</code> and <code>extend</code> methods in the local names <code>append_result</code> and <code>extend_result</code>. The last local we create is a <code>to_str</code> shorthand for the <code>str</code> built-in.</p>

<p>These kinds of shortcuts are unusual. Let's look at them more closely. In Python, a method call on an object like <code>result.append(&quot;hello&quot;)</code> is executed in two steps. First, the append attribute is fetched from the result object: <code>result.append</code>. Then the value fetched is invoked as a function, passing it the argument <code>&quot;hello&quot;</code>. Although we're used to seeing those steps performed together, they really are separate. If you save the result of the first step, you can perform the second step on the saved value. So these two Python snippets do the same thing:</p>

<pre class="sourceCode python"><code class="sourceCode python"><span class="co"># The way we&#39;re used to seeing it:</span>
result.append(<span class="st">&quot;hello&quot;</span>)

<span class="co"># But this works the same:</span>
append_result = result.append
append_result(<span class="st">&quot;hello&quot;</span>)</code></pre>

<p>In the template engine code, we've split it out this way so that we only do the first step once, no matter how many times we do the second step. This saves us a small amount of time, because we avoid taking the time to look up the append attribute.</p>

<p>This is an example of a micro-optimization: an unusual coding technique that gains us tiny improvements in speed. Micro-optimizations can be less readable, or more confusing, so they are only justified for code that is a proven performance bottleneck. Developers disagree on how much micro-optimization is justified, and some beginners overdo it. The optimizations here were added only after timing experiments showed that they improved performance, even if only a little bit. Micro-optimizations can be instructive, as they make use of some exotic aspects of Python, but don't over-use them in your own code.</p>
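<p>A timing experiment of this kind can be sketched with the standard <code>timeit</code> module. The two functions below are illustrative, not taken from the engine, and the numbers will vary with the machine and the Python version; on some interpreters the difference may be very small.</p>

<pre class="sourceCode python"><code class="sourceCode python">import timeit

def with_lookup(n=10000):
    result = []
    for i in range(n):
        result.append(i)            # attribute lookup on every iteration
    return result

def with_saved_method(n=10000):
    result = []
    append_result = result.append   # attribute lookup happens once
    for i in range(n):
        append_result(i)
    return result

# Run each version 200 times and report the total elapsed seconds.
lookup_time = timeit.timeit(with_lookup, number=200)
saved_time = timeit.timeit(with_saved_method, number=200)
print("lookup each time: %.4f s" % lookup_time)
print("saved method:     %.4f s" % saved_time)</code></pre>

<p>Measuring before and after like this is what separates a justified micro-optimization from a guess.</p>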

<p>The shortcut for <code>str</code> is also a micro-optimization. Names in Python can be local to a function, global to a module, or built-in to Python. Looking up a local name is faster than looking up a global or a built-in. We're used to the fact that <code>str</code> is a built-in that is always available, but Python still has to look up the name <code>str</code> each time it is used. Putting it in a local saves us another small slice of time because locals are faster than built-ins.</p>

<p>Once those shortcuts are defined, we're ready for the Python lines created from our particular template. Strings will be added to the result list using the <code>append_result</code> or <code>extend_result</code> shorthands, depending on whether we have one string to add, or more than one. Literal text in the template becomes a simple string literal.</p>

<p>Having both append and extend adds complexity, but remember we're aiming for the fastest execution of the template, and using extend for one item means making a new list of one item so that we can pass it to extend.</p>

<p>Expressions in <code>{{ ... }}</code> are computed, converted to strings, and added to the result. Dots in the expression are handled by the <code>do_dots</code> function passed into our function, because the meaning of the dotted expressions depends on the data in the context: it could be attribute access or item access, and it could be a callable.</p>

<p>The logical structures <code>{% if ... %}</code> and <code>{% for ... %}</code> are converted into Python conditionals and loops. The expression in the <code>{% if/for ... %}</code> tag will become the expression in the <code>if</code> or <code>for</code> statement, and the contents up until the <code>{% end... %}</code> tag will become the body of the statement.</p>
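<p>For instance, a one-line template such as <code>{% if user.is_logged_in %}Welcome, {{ user.name }}!{% endif %}</code> might compile to something like the function below. This is a hand-written approximation in the style of the generated code shown earlier, so the engine's real output may differ in detail, and the <code>do_dots</code> here is a simplified stand-in that handles only dictionaries.</p>

<pre class="sourceCode python"><code class="sourceCode python">def render_function(context, do_dots):
    c_user = context['user']

    result = []
    append_result = result.append
    extend_result = result.extend
    to_str = str

    # The tag's expression becomes the condition of a Python if statement.
    if do_dots(c_user, 'is_logged_in'):
        # Everything up to the end tag becomes the body.
        extend_result([
            'Welcome, ',
            to_str(do_dots(c_user, 'name')),
            '!'
        ])
    return ''.join(result)

def do_dots(value, *dots):
    # Simplified stand-in for this example: dictionary lookups only.
    for dot in dots:
        value = value[dot]
    return value

logged_in = {'user': {'is_logged_in': True, 'name': 'Charlie'}}
print(render_function(logged_in, do_dots))   # prints: Welcome, Charlie!</code></pre>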

<!-- [[[cog from cogutil import include ]]] -->

<!-- [[[end]]] -->

<h2 id="writing-the-engine">Writing the Engine</h2>

<p>Now that we understand what the engine will do, let's walk through the implementation.</p>

<h3 id="the-templite-class">The Templite class</h3>

<p>The heart of the template engine is the Templite class. (Get it? It's a template, but it's lite!)</p>

<p>The Templite class has a small interface. You construct a Templite object with the text of the template, then later you can use the <code>render</code> method on it to render a particular context, the dictionary of data, through the template:</p>

<pre class="sourceCode python"><code class="sourceCode python"><span class="co"># Make a Templite object.</span>
templite = Templite(<span class="st">&#39;&#39;&#39;</span>
<span class="st">    &lt;h1&gt;Hello </span><span class="ot">{{</span><span class="st">name|upper</span><span class="ot">}}</span><span class="st">!&lt;/h1&gt;</span>
<span class="st">    {% for topic in topics %}</span>
<span class="st">        &lt;p&gt;You are interested in </span><span class="ot">{{</span><span class="st">topic</span><span class="ot">}}</span><span class="st">.&lt;/p&gt;</span>
<span class="st">    {% endfor %}</span>
<span class="st">    &#39;&#39;&#39;</span>,
    {<span class="st">&#39;upper&#39;</span>: <span class="dt">str</span>.upper},
)

<span class="co"># Later, use it to render some data.</span>
text = templite.render({
    <span class="st">&#39;name&#39;</span>: <span class="st">&quot;Ned&quot;</span>,
    <span class="st">&#39;topics&#39;</span>: [<span class="st">&#39;Python&#39;</span>, <span class="st">&#39;Geometry&#39;</span>, <span class="st">&#39;Juggling&#39;</span>],
})</code></pre>

<p>We pass the text of the template when the object is created so that we can do the compile step just once, and later call <code>render</code> many times to reuse the compiled results.</p>

<p>The constructor also accepts a dictionary of values, an initial context. These are stored in the Templite object, and will be available when the template is later rendered. These are good for defining functions or constants we want to be available everywhere, like <code>upper</code> in the previous example.</p>
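<p>One plausible way to combine the stored values with the data for a single rendering is to copy the stored dictionary and update the copy. The sketch below uses plain dictionaries and a hypothetical <code>build_render_context</code> helper, and it assumes that per-render values take precedence over the initial ones.</p>

<pre class="sourceCode python"><code class="sourceCode python"># Values given when the template is constructed (filters, constants).
initial_context = {'upper': str.upper, 'site_name': 'Example'}

def build_render_context(initial, context=None):
    """Combine the stored context with one render call's data."""
    render_context = dict(initial)      # copy, so the original is untouched
    if context:
        render_context.update(context)  # per-render values take precedence
    return render_context

first = build_render_context(initial_context, {'name': 'Ned'})
second = build_render_context(initial_context, {'name': 'Charlie'})

print(sorted(first))                    # prints: ['name', 'site_name', 'upper']
print(first['name'], second['name'])    # prints: Ned Charlie</code></pre>

<p>Because each rendering works on its own copy, data from one call cannot leak into the next.</p>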

<p>Before we discuss the implementation of Templite, we have a helper to define first: CodeBuilder.</p>

<h3 id="codebuilder">CodeBuilder</h3>

<p>The bulk of the work in our engine is parsing the template and producing the necessary Python code. To help with producing the Python, we have the CodeBuilder class, which handles the bookkeeping for us as we construct the Python code. It adds lines of code, manages indentation, and finally gives us values from the compiled Python.</p>

<p>One CodeBuilder object is responsible for a complete chunk of Python code. As used by our template engine, the chunk of Python is always a single complete function definition. But the CodeBuilder class makes no assumption that it will only be one function. This keeps the CodeBuilder code more general, and less coupled to the rest of the template engine code.</p>

<p>As we'll see, we also use nested CodeBuilders to make it possible to put code at the beginning of the function even though we don't know what it will be until we are nearly done.</p>

<p>A CodeBuilder object keeps a list of strings that will together be the final Python code. The only other state it needs is the current indentation level:</p>

<!-- [[[cog include("templite.py", first="class CodeBuilder", numblanks=2) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python"><span class="kw">class</span> CodeBuilder(<span class="dt">object</span>):
    <span class="co">&quot;&quot;&quot;Build source code conveniently.&quot;&quot;&quot;</span>

    <span class="kw">def</span> <span class="ot">__init__</span>(<span class="ot">self</span>, indent=<span class="dv">0</span>):
        <span class="ot">self</span>.code = []
        <span class="ot">self</span>.indent_level = indent</code></pre>

<!-- [[[end]]] -->

<p>CodeBuilder doesn't do much. <code>add_line</code> adds a new line of code, which automatically indents the text to the current indentation level, and supplies a newline:</p>

<!-- [[[cog include("templite.py", first="def add_line", numblanks=3, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> add_line(<span class="ot">self</span>, line):
        <span class="co">&quot;&quot;&quot;Add a line of source to the code.</span>

<span class="co">        Indentation and newline will be added for you, don&#39;t provide them.</span>

<span class="co">        &quot;&quot;&quot;</span>
        <span class="ot">self</span>.code.extend([<span class="st">&quot; &quot;</span> * <span class="ot">self</span>.indent_level, line, <span class="st">&quot;</span><span class="ch">\n</span><span class="st">&quot;</span>])</code></pre>

<!-- [[[end]]] -->

<p><code>indent</code> and <code>dedent</code> increase or decrease the indentation level:</p>

<!-- [[[cog include("templite.py", first="INDENT_STEP = 4", numblanks=3, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    INDENT_STEP = <span class="dv">4</span>      <span class="co"># PEP8 says so!</span>

    <span class="kw">def</span> indent(<span class="ot">self</span>):
        <span class="co">&quot;&quot;&quot;Increase the current indent for following lines.&quot;&quot;&quot;</span>
        <span class="ot">self</span>.indent_level += <span class="ot">self</span>.INDENT_STEP

    <span class="kw">def</span> dedent(<span class="ot">self</span>):
        <span class="co">&quot;&quot;&quot;Decrease the current indent for following lines.&quot;&quot;&quot;</span>
        <span class="ot">self</span>.indent_level -= <span class="ot">self</span>.INDENT_STEP</code></pre>

<!-- [[[end]]] -->

<p><code>add_section</code> adds a section, which is managed by another CodeBuilder object. This lets us keep a reference to a place in the code, and add text to it later. The <code>self.code</code> list is mostly a list of strings, but will also hold references to these sections:</p>

<!-- [[[cog include("templite.py", first="def add_section", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> add_section(<span class="ot">self</span>):
        <span class="co">&quot;&quot;&quot;Add a section, a sub-CodeBuilder.&quot;&quot;&quot;</span>
        section = CodeBuilder(<span class="ot">self</span>.indent_level)
        <span class="ot">self</span>.code.append(section)
        <span class="kw">return</span> section</code></pre>

<!-- [[[end]]] -->

<p><code>__str__</code> produces a single string with all the code. This simply joins together all the strings in <code>self.code</code>. Note that because <code>self.code</code> can contain sections, this might call other <code>CodeBuilder</code> objects recursively:</p>

<!-- [[[cog include("templite.py", first="def __str__", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> <span class="ot">__str__</span>(<span class="ot">self</span>):
        <span class="kw">return</span> <span class="st">&quot;&quot;</span>.join(<span class="dt">str</span>(c) <span class="kw">for</span> c in <span class="ot">self</span>.code)</code></pre>

<!-- [[[end]]] -->

<p><code>get_globals</code> yields the final values by executing the code. This stringifies the object, executes it to get its definitions, and returns the resulting values:</p>

<!-- [[[cog include("templite.py", first="def get_globals", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> get_globals(<span class="ot">self</span>):
        <span class="co">&quot;&quot;&quot;Execute the code, and return a dict of globals it defines.&quot;&quot;&quot;</span>
        <span class="co"># A check that the caller really finished all the blocks they started.</span>
        <span class="kw">assert</span> <span class="ot">self</span>.indent_level == <span class="dv">0</span>
        <span class="co"># Get the Python source as a single string.</span>
        python_source = <span class="dt">str</span>(<span class="ot">self</span>)
        <span class="co"># Execute the source, defining globals, and return them.</span>
        global_namespace = {}
        <span class="dt">exec</span>(python_source, global_namespace)
        <span class="kw">return</span> global_namespace</code></pre>

<!-- [[[end]]] -->

<p>This last method uses some exotic features of Python. The <code>exec</code> function executes a string containing Python code. The second argument to <code>exec</code> is a dictionary that will collect up the globals defined by the code. So for example, if we do this:</p>

<pre class="sourceCode python"><code class="sourceCode python">python_source = <span class="st">&quot;&quot;&quot;\</span>
<span class="st">SEVENTEEN = 17</span>

<span class="st">def three():</span>
<span class="st">    return 3</span>
<span class="st">&quot;&quot;&quot;</span>
global_namespace = {}
<span class="dt">exec</span>(python_source, global_namespace)</code></pre>

<p>then <code>global_namespace['SEVENTEEN']</code> is 17, and <code>global_namespace['three']</code> is an actual function named <code>three</code>.</p>

<p>Although we only use CodeBuilder to produce one function, there's nothing here that limits it to that use. This makes the class simpler to implement, and easier to understand.</p>

<p>CodeBuilder lets us create a chunk of Python source code, and has no specific knowledge about our template engine at all. We could use it in such a way that three different functions would be defined in the Python, and then <code>get_globals</code> would return a dict of three values, the three functions. As it happens, our template engine only needs to define one function. But it's better software design to keep that implementation detail in the template engine code, and out of our CodeBuilder class.</p>

<p>Even as we're actually using it—to define a single function—having <code>get_globals</code> return the dictionary keeps the code more modular because it doesn't need to know the name of the function we've defined. Whatever function name we define in our Python source, we can retrieve that name from the dict returned by <code>get_globals</code>.</p>

<p>Now we can get into the implementation of the Templite class itself, and see how and where CodeBuilder is used.</p>

<h3 id="the-templite-class-implementation">The Templite class implementation</h3>

<p>Most of our code is in the Templite class. As we've discussed, it has both a compilation and a rendering phase.</p>

<h4 id="compiling">Compiling</h4>

<p>All of the work to compile the template into a Python function happens in the Templite constructor. First the contexts are saved away:</p>

<!-- [[[cog include("templite.py", first="def __init__(self, text, ", numblanks=3, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> <span class="ot">__init__</span>(<span class="ot">self</span>, text, *contexts):
        <span class="co">&quot;&quot;&quot;Construct a Templite with the given `text`.</span>

<span class="co">        `contexts` are dictionaries of values to use for future renderings.</span>
<span class="co">        These are good for filters and global values.</span>

<span class="co">        &quot;&quot;&quot;</span>
        <span class="ot">self</span>.context = {}
        <span class="kw">for</span> context in contexts:
            <span class="ot">self</span>.context.update(context)</code></pre>

<!-- [[[end]]] -->

<p>Notice we used <code>*contexts</code> as the parameter. The asterisk denotes that any number of positional arguments will be packed into a tuple and passed in as <code>contexts</code>. This is called argument packing, and means that the caller can provide any number of context dictionaries. All of these calls are valid:</p>

<pre class="sourceCode python"><code class="sourceCode python">t = Templite(template_text)
t = Templite(template_text, context1)
t = Templite(template_text, context1, context2)</code></pre>

<p>The context arguments (if any) are supplied to the constructor as a tuple of contexts. We can then iterate over the <code>contexts</code> tuple, dealing with each of them in turn. We simply create one combined dictionary called <code>self.context</code> which has the contents of all of the supplied contexts. If duplicate names are provided in the contexts, the last one wins.</p>
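<p>The merging behavior can be tried on its own. This is a minimal sketch: <code>merge_contexts</code> is a hypothetical stand-alone function that mirrors the loop in the constructor, and the dictionaries are made up for the example.</p>

```python
def merge_contexts(*contexts):
    """Combine any number of dicts; later dicts override earlier ones."""
    combined = {}
    for context in contexts:
        combined.update(context)
    return combined

# Hypothetical context dictionaries, invented for this example.
defaults = {"title": "Untitled", "lang": "en"}
overrides = {"title": "A Template Engine"}

merged = merge_contexts(defaults, overrides)
print(merged)              # 'title' comes from overrides, 'lang' from defaults
print(merge_contexts())    # no contexts at all gives an empty dict
```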

<p>To make our compiled function as fast as possible, we extract context variables into Python locals. We'll get those names by keeping a set of variable names we encounter, but we also need to track the names of variables defined in the template, the loop variables:</p>

<!-- [[[cog include("templite.py", first="self.all_vars", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        <span class="ot">self</span>.all_vars = <span class="dt">set</span>()
        <span class="ot">self</span>.loop_vars = <span class="dt">set</span>()</code></pre>

<!-- [[[end]]] -->

<p>Later we'll see how these get used to help construct the prologue of our function. First, we'll use the CodeBuilder class we wrote earlier to start to build our compiled function:</p>

<!-- [[[cog include("templite.py", first="code = CodeBuilder", numblanks=2, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        code = CodeBuilder()

        code.add_line(<span class="st">&quot;def render_function(context, do_dots):&quot;</span>)
        code.indent()
        vars_code = code.add_section()
        code.add_line(<span class="st">&quot;result = []&quot;</span>)
        code.add_line(<span class="st">&quot;append_result = result.append&quot;</span>)
        code.add_line(<span class="st">&quot;extend_result = result.extend&quot;</span>)
        code.add_line(<span class="st">&quot;to_str = str&quot;</span>)</code></pre>

<!-- [[[end]]] -->

<p>Here we construct our CodeBuilder object, and start writing lines into it. Our Python function will be called <code>render_function</code>, and will take two arguments: <code>context</code> is the data dictionary it should use, and <code>do_dots</code> is a function implementing dot attribute access.</p>

<p>The context here is the combination of the data context passed to the Templite constructor, and the data context passed to the render function. It's the complete set of data available to the template that we made in the Templite constructor.</p>

<p>Notice that CodeBuilder is very simple: it doesn't &quot;know&quot; about function definitions, just lines of code. This keeps CodeBuilder simple, both in its implementation, and in its use. We can read our generated code here without having to mentally interpolate too many specialized CodeBuilder methods.</p>

<p>We create a section called <code>vars_code</code>. Later we'll write the variable extraction lines into that section. The <code>vars_code</code> object lets us save a place in the function that can be filled in later when we have the information we need.</p>

<p>Then four fixed lines are written, defining a result list, shortcuts for the methods to append to or extend that list, and a shortcut for the <code>str()</code> builtin. As we discussed earlier, this odd step squeezes just a little bit more performance out of our rendering function.</p>

<p>The reason we have both the <code>append</code> and the <code>extend</code> shortcut is so we can use the more efficient method, depending on whether we have one string to add to our result, or more than one.</p>
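<p>One way to get a feel for these shortcuts is to time two otherwise identical functions. The sketch below is only an experiment, not part of the engine: <code>with_lookup</code> and <code>with_shortcuts</code> are hypothetical names, and the timings will vary by interpreter version and machine, so treat any difference as indicative at best.</p>

```python
import timeit

def with_lookup(n):
    result = []
    for i in range(n):
        result.append(str(i))      # attribute and builtin lookups every time
    return "".join(result)

def with_shortcuts(n):
    result = []
    append_result = result.append  # bound method looked up once
    to_str = str                   # builtin copied into a local name
    for i in range(n):
        append_result(to_str(i))
    return "".join(result)

# Both versions build exactly the same string.
same = with_lookup(1000) == with_shortcuts(1000)
print("same output:", same)
print("lookup:   ", timeit.timeit(lambda: with_lookup(1000), number=200))
print("shortcuts:", timeit.timeit(lambda: with_shortcuts(1000), number=200))
```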

<p>Next we define an inner function to help us with buffering output strings:</p>

<!-- [[[cog include("templite.py", first="buffered =", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        buffered = []
        <span class="kw">def</span> flush_output():
            <span class="co">&quot;&quot;&quot;Force `buffered` to the code builder.&quot;&quot;&quot;</span>
            <span class="kw">if</span> <span class="dt">len</span>(buffered) == <span class="dv">1</span>:
                code.add_line(<span class="st">&quot;append_result(</span><span class="ot">%s</span><span class="st">)&quot;</span> % buffered[<span class="dv">0</span>])
            <span class="kw">elif</span> <span class="dt">len</span>(buffered) &gt; <span class="dv">1</span>:
                code.add_line(<span class="st">&quot;extend_result([</span><span class="ot">%s</span><span class="st">])&quot;</span> % <span class="st">&quot;, &quot;</span>.join(buffered))
            <span class="kw">del</span> buffered[:]</code></pre>

<!-- [[[end]]] -->

<p>As we create chunks of output that need to go into our compiled function, we need to turn them into function calls that append to our result. We'd like to combine repeated append calls into one extend call. This is another micro-optimization. To make this possible, we buffer the chunks.</p>

<p>The <code>buffered</code> list holds strings that are yet to be written to our function source code. As our template compilation proceeds, we'll append strings to <code>buffered</code>, and flush them to the function source when we reach control flow points, like if statements, or the beginnings or ends of loops.</p>

<p>The <code>flush_output</code> function is a <em>closure</em>, which is a fancy word for a function that refers to variables outside of itself. Here <code>flush_output</code> refers to <code>buffered</code> and <code>code</code>. This simplifies our calls to the function: we don't have to tell <code>flush_output</code> what buffer to flush, or where to flush it; it knows all that implicitly.</p>

<p>If only one string has been buffered, then the <code>append_result</code> shortcut is used to append it to the result. If more than one is buffered, then the <code>extend_result</code> shortcut is used, with all of them, to add them to the result. Then the buffered list is cleared so more strings can be buffered.</p>

<p>The rest of the compiling code will add lines to the function by appending them to <code>buffered</code>, and eventually call <code>flush_output</code> to write them to the CodeBuilder.</p>

<p>With this function in place, we can have a line of code in our compiler like this:</p>

<pre class="sourceCode python"><code class="sourceCode python">buffered.append(<span class="st">&quot;&#39;hello&#39;&quot;</span>)</code></pre>

<p>which will mean that our compiled Python function will have this line:</p>

<pre class="sourceCode python"><code class="sourceCode python">append_result(<span class="st">&#39;hello&#39;</span>)</code></pre>

<p>which will add the string <code>hello</code> to the rendered output of the template. We have multiple levels of abstraction here which can be difficult to keep straight. The compiler uses <code>buffered.append(&quot;'hello'&quot;)</code>, which creates <code>append_result('hello')</code> in the compiled Python function, which when run, appends <code>hello</code> to the template result.</p>
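<p>To keep the levels apart, it can help to run the buffering step in isolation. In this sketch a plain list, <code>source_lines</code>, stands in for the CodeBuilder (an assumption made so the example is self-contained); the body of <code>flush_output</code> otherwise follows the same logic as above.</p>

```python
source_lines = []   # stand-in for the CodeBuilder: collects generated lines
buffered = []

def flush_output():
    """Move everything in `buffered` into `source_lines`."""
    if len(buffered) == 1:
        source_lines.append("append_result(%s)" % buffered[0])
    elif buffered:
        source_lines.append("extend_result([%s])" % ", ".join(buffered))
    del buffered[:]

# Level 1: the compiler buffers pieces of Python source text.
buffered.append(repr("hello"))
flush_output()                      # one piece: becomes an append call

buffered.append(repr("hello, "))
buffered.append("to_str(c_name)")
flush_output()                      # two pieces: become one extend call

# Level 2: the generated source, which is still just text at this point.
print("\n".join(source_lines))
```

<p>The third level, actually producing output, only happens later when the generated source is executed and the resulting function is called.</p>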

<p>Back to our Templite class. As we parse control structures, we want to check that they are properly nested. The <code>ops_stack</code> list is a stack of strings:</p>

<!-- [[[cog include("templite.py", first="ops_stack", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        ops_stack = []</code></pre>

<!-- [[[end]]] -->

<p>When we encounter an <code>{% if .. %}</code> tag (for example), we'll push <code>'if'</code> onto the stack. When we find an <code>{% endif %}</code> tag, we can pop the stack and report an error if there was no <code>'if'</code> at the top of the stack.</p>
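<p>The stack discipline can be sketched separately from the compiler. Here <code>check_nesting</code> is a hypothetical helper that takes just the first word of each tag and returns an error message instead of raising, which is a simplification made for the example.</p>

```python
def check_nesting(tag_words):
    """Return an error message for badly nested tags, or None if all is well."""
    ops_stack = []
    for word in tag_words:
        if word in ("if", "for"):
            ops_stack.append(word)          # opening tag: remember it
        elif word.startswith("end"):
            if not ops_stack:
                return "Too many ends: %r" % word
            start_what = ops_stack.pop()
            if start_what != word[3:]:      # 'endif'[3:] == 'if'
                return "Mismatched end tag: %r" % word
    if ops_stack:
        return "Unmatched action tag: %r" % ops_stack[-1]
    return None

print(check_nesting(["if", "for", "endfor", "endif"]))
print(check_nesting(["if", "endfor"]))
print(check_nesting(["for"]))
```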

<p>Now the real parsing begins. We split the template text into a number of tokens using a regular expression, or <em>regex</em>. Regexes can be daunting: they are a very compact notation for complex pattern matching. They are also very efficient, since the complexity of matching the pattern is implemented in C in the regular expression engine, rather than in your own Python code. Here's our regex:</p>

<!-- [[[cog include("templite.py", first="tokens =", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        tokens = re.split(<span class="st">r&quot;(?s)(</span><span class="ot">{{</span><span class="st">.*?</span><span class="ot">}}</span><span class="st">|{%.*?%}|{#.*?#})&quot;</span>, text)</code></pre>

<!-- [[[end]]] -->

<p>This looks complicated; let's break it down.</p>

<p>The <code>re.split</code> function will split a string using a regex. Our pattern matches our tag syntaxes, and because it is parenthesized, the string is split at the tags and the tags themselves are also returned as pieces in the split list.</p>

<p>The <code>(?s)</code> flag in the regex means that a dot should match even a newline. Next we have our parenthesized group of three alternatives: <code>{{.*?}}</code> matches an expression, <code>{%.*?%}</code> matches a tag, and <code>{#.*?#}</code> matches a comment. In all of these, we use <code>.*?</code> to match any number of characters, but the shortest sequence that matches.</p>
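<p>Both details are easy to check interactively. The following sketch uses small made-up strings to compare the greedy and non-greedy forms, and to show what happens to a multi-line comment with and without the <code>(?s)</code> flag.</p>

```python
import re

text = "{{a}} and {{b}}"

# Greedy: .* runs to the last "}}", swallowing both expressions.
greedy = re.findall(r"{{.*}}", text)
# Non-greedy: .*? stops at the first "}}" it can.
lazy = re.findall(r"{{.*?}}", text)
print(greedy)
print(lazy)

# Without (?s), a dot will not cross the newline inside this comment.
multiline = "{# a comment\nspanning two lines #}"
without_flag = re.findall(r"{#.*?#}", multiline)
with_flag = re.findall(r"(?s){#.*?#}", multiline)
print(without_flag)
print(with_flag)
```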

<p>The result of <code>re.split</code> is a list of strings. For example, this template text:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Topics for {{name}}: {% for t in topics %}{{t}}, {% endfor %}<span class="kw">&lt;/p&gt;</span></code></pre>

<p>would be split into these pieces:</p>

<pre class="sourceCode python"><code class="sourceCode python">[
    <span class="st">&#39;&lt;p&gt;Topics for &#39;</span>,               <span class="co"># literal</span>
    <span class="st">&#39;{{name}}&#39;</span>,                     <span class="co"># expression</span>
    <span class="st">&#39;: &#39;</span>,                           <span class="co"># literal</span>
    <span class="st">&#39;{% for t in topics %}&#39;</span>,        <span class="co"># tag</span>
    <span class="st">&#39;&#39;</span>,                             <span class="co"># literal (empty)</span>
    <span class="st">&#39;{{t}}&#39;</span>,                        <span class="co"># expression</span>
    <span class="st">&#39;, &#39;</span>,                           <span class="co"># literal</span>
    <span class="st">&#39;{% endfor %}&#39;</span>,                 <span class="co"># tag</span>
    <span class="st">&#39;&lt;/p&gt;&#39;</span>                          <span class="co"># literal</span>
]</code></pre>

<p>Once the text is split into tokens like this, we can loop over the tokens, and deal with each in turn. By splitting them according to their type, we can handle each type separately.</p>
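<p>A small stand-alone sketch of that idea: split a template with the same regex, then label each piece by its first two characters. The <code>classify</code> function and the sample template are hypothetical and exist only for this example; the real compiler acts on each token instead of labeling it.</p>

```python
import re

TOKEN_RE = r"(?s)({{.*?}}|{%.*?%}|{#.*?#})"

def classify(token):
    """Name the kind of token, judging only by its first two characters."""
    if token.startswith("{#"):
        return "comment"
    elif token.startswith("{{"):
        return "expression"
    elif token.startswith("{%"):
        return "tag"
    else:
        return "literal"

text = "Hi {{name}}{# greet #}{% if ok %}!{% endif %}"
kinds = [(classify(tok), tok) for tok in re.split(TOKEN_RE, text)]
for kind, tok in kinds:
    print("%-10s %r" % (kind, tok))
```

<p>Note the empty literals that appear between adjacent tags and at the end of the text; the compiler has to be prepared for those.</p>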

<p>The compilation code is a loop over these tokens:</p>

<!-- [[[cog include("templite.py", first="for token", numlines=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        <span class="kw">for</span> token in tokens:</code></pre>

<!-- [[[end]]] -->

<p>Each token is examined to see which of the four cases it is. Just looking at the first two characters is enough. The first case is a comment, which is easy to handle: just ignore it and move on to the next token:</p>

<!-- [[[cog include("templite.py", first="if token.", numlines=3, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">            <span class="kw">if</span> token.startswith(<span class="st">&#39;{#&#39;</span>):
                <span class="co"># Comment: ignore it and move on.</span>
                <span class="kw">continue</span></code></pre>

<!-- [[[end]]] -->

<p>For the case of <code>{{...}}</code> expressions, we cut off the two braces at the front and back, strip off the white space, and pass the entire expression to <code>_expr_code</code>:</p>

<!-- [[[cog include("templite.py", first="elif token.startswith('{{')", numlines=4, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">            <span class="kw">elif</span> token.startswith(<span class="st">&#39;</span><span class="ot">{{</span><span class="st">&#39;</span>):
                <span class="co"># An expression to evaluate.</span>
                expr = <span class="ot">self</span>._expr_code(token[<span class="dv">2</span>:-<span class="dv">2</span>].strip())
                buffered.append(<span class="st">&quot;to_str(</span><span class="ot">%s</span><span class="st">)&quot;</span> % expr)</code></pre>

<!-- [[[end]]] -->

<p>The <code>_expr_code</code> method will compile the template expression into a Python expression. We'll see that function later. We use the <code>to_str</code> function to force the expression's value to be a string, and add that to our result.</p>

<p>The third case is the big one: <code>{% ... %}</code> tags. These are control structures that will become Python control structures. First we have to flush our buffered output lines, then we extract a list of words from the tag:</p>

<!-- [[[cog include("templite.py", first="elif token.startswith('{%')", numlines=4, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">            <span class="kw">elif</span> token.startswith(<span class="st">&#39;{%&#39;</span>):
                <span class="co"># Action tag: split into words and parse further.</span>
                flush_output()
                words = token[<span class="dv">2</span>:-<span class="dv">2</span>].strip().split()</code></pre>

<!-- [[[end]]] -->

<p>Now we have three sub-cases, based on the first word in the tag: <code>if</code>, <code>for</code>, or <code>end</code>. The <code>if</code> case shows our simple error handling and code generation:</p>

<!-- [[[cog include("templite.py", first="if words[0] == 'if'", numlines=7, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">                <span class="kw">if</span> words[<span class="dv">0</span>] == <span class="st">&#39;if&#39;</span>:
                    <span class="co"># An if statement: evaluate the expression to determine if.</span>
                    <span class="kw">if</span> <span class="dt">len</span>(words) != <span class="dv">2</span>:
                        <span class="ot">self</span>._syntax_error(<span class="st">&quot;Don&#39;t understand if&quot;</span>, token)
                    ops_stack.append(<span class="st">&#39;if&#39;</span>)
                    code.add_line(<span class="st">&quot;if </span><span class="ot">%s</span><span class="st">:&quot;</span> % <span class="ot">self</span>._expr_code(words[<span class="dv">1</span>]))
                    code.indent()</code></pre>

<!-- [[[end]]] -->

<p>The <code>if</code> tag should have a single expression, so the <code>words</code> list should have only two elements in it. If it doesn't, we use the <code>_syntax_error</code> helper method to raise a syntax error exception. We push <code>'if'</code> onto <code>ops_stack</code> so that we can check the <code>endif</code> tag. The expression part of the <code>if</code> tag is compiled to a Python expression with <code>_expr_code</code>, and is used as the conditional expression in a Python <code>if</code> statement.</p>

<p>The second tag type is <code>for</code>, which will be compiled to a Python <code>for</code> statement:</p>

<!-- [[[cog include("templite.py", first="elif words[0] == 'for'", numlines=13, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">                <span class="kw">elif</span> words[<span class="dv">0</span>] == <span class="st">&#39;for&#39;</span>:
                    <span class="co"># A loop: iterate over expression result.</span>
                    <span class="kw">if</span> <span class="dt">len</span>(words) != <span class="dv">4</span> or words[<span class="dv">2</span>] != <span class="st">&#39;in&#39;</span>:
                        <span class="ot">self</span>._syntax_error(<span class="st">&quot;Don&#39;t understand for&quot;</span>, token)
                    ops_stack.append(<span class="st">&#39;for&#39;</span>)
                    <span class="ot">self</span>._variable(words[<span class="dv">1</span>], <span class="ot">self</span>.loop_vars)
                    code.add_line(
                        <span class="st">&quot;for c_</span><span class="ot">%s</span><span class="st"> in </span><span class="ot">%s</span><span class="st">:&quot;</span> % (
                            words[<span class="dv">1</span>],
                            <span class="ot">self</span>._expr_code(words[<span class="dv">3</span>])
                        )
                    )
                    code.indent()</code></pre>

<!-- [[[end]]] -->

<p>We do a check of the syntax and push <code>'for'</code> onto the stack. The <code>_variable</code> method checks the syntax of the variable, and adds it to the set we provide. This is how we collect up the names of all the variables during compilation. Later we'll need to write the prologue of our function, where we'll unpack all the variable names we get from the context. To do that correctly, we need to know the names of all the variables we encountered, <code>self.all_vars</code>, and the names of all the variables defined by loops, <code>self.loop_vars</code>.</p>

<p>We add one line to our function source, a <code>for</code> statement. All of our template variables are turned into Python variables by prepending <code>c_</code> to them, so that we know they won't collide with other names we're using in our Python function. We use <code>_expr_code</code> to compile the iteration expression from the template into an iteration expression in Python.</p>
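<p>To see the shape of the generated lines, here is a simplified sketch of the <code>if</code> and <code>for</code> cases. It makes two assumptions that differ from the real code: expressions are plain names (so <code>c_</code> is prepended directly rather than calling <code>_expr_code</code>), and a <code>ValueError</code> stands in for the syntax error helper. The name <code>tag_to_line</code> is hypothetical.</p>

```python
def tag_to_line(words):
    """Return the Python line for an `if` or `for` tag's word list."""
    if words[0] == "if":
        if len(words) != 2:
            raise ValueError("Don't understand if: %r" % (words,))
        return "if c_%s:" % words[1]
    elif words[0] == "for":
        if len(words) != 4 or words[2] != "in":
            raise ValueError("Don't understand for: %r" % (words,))
        return "for c_%s in c_%s:" % (words[1], words[3])
    raise ValueError("Don't understand tag: %r" % words[0])

# Same slicing and splitting as the compiler applies to a tag token.
for_words = "{% for t in topics %}"[2:-2].strip().split()
print(for_words)
print(tag_to_line(for_words))
print(tag_to_line(["if", "logged_in"]))
```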

<p>The last kind of tag we handle is an <code>end</code> tag; either <code>{% endif %}</code> or <code>{% endfor %}</code>. The effect on our compiled function source is the same: simply unindent to end the <code>if</code> or <code>for</code> statement that was started earlier:</p>

<!-- [[[cog include("templite.py", first="elif words[0].startswith('end')", numlines=11, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">                <span class="kw">elif</span> words[<span class="dv">0</span>].startswith(<span class="st">&#39;end&#39;</span>):
                    <span class="co"># Endsomething.  Pop the ops stack.</span>
                    <span class="kw">if</span> <span class="dt">len</span>(words) != <span class="dv">1</span>:
                        <span class="ot">self</span>._syntax_error(<span class="st">&quot;Don&#39;t understand end&quot;</span>, token)
                    end_what = words[<span class="dv">0</span>][<span class="dv">3</span>:]
                    <span class="kw">if</span> not ops_stack:
                        <span class="ot">self</span>._syntax_error(<span class="st">&quot;Too many ends&quot;</span>, token)
                    start_what = ops_stack.pop()
                    <span class="kw">if</span> start_what != end_what:
                        <span class="ot">self</span>._syntax_error(<span class="st">&quot;Mismatched end tag&quot;</span>, end_what)
                    code.dedent()</code></pre>

<!-- [[[end]]] -->

<p>Notice here that the actual work needed for the end tag is one line: unindent the function source. The rest of this clause is all error checking to make sure that the template is properly formed. This isn't unusual in program translation code.</p>

<p>Speaking of error handling, if the tag isn't an <code>if</code>, a <code>for</code>, or an <code>end</code>, then we don't know what it is, so raise a syntax error:</p>

<!-- [[[cog include("templite.py", first="else:", numlines=2, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">                <span class="kw">else</span>:
                    <span class="ot">self</span>._syntax_error(<span class="st">&quot;Don&#39;t understand tag&quot;</span>, words[<span class="dv">0</span>])</code></pre>

<!-- [[[end]]] -->

<p>We're done with the three different special syntaxes (<code>{{...}}</code>, <code>{#...#}</code>, and <code>{%...%}</code>). What's left is literal content. We'll add the literal string to the buffered output, using the <code>repr</code> built-in function to produce a Python string literal for the token:</p>

<!-- [[[cog include("templite.py", first="else:", after="Don't understand tag", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">            <span class="kw">else</span>:
                <span class="co"># Literal content.  If it isn&#39;t empty, output it.</span>
                <span class="kw">if</span> token:
                    buffered.append(<span class="dt">repr</span>(token))</code></pre>

<!-- [[[end]]] -->

<p>If we didn't use <code>repr</code>, then we'd end up with lines like this in our compiled function:</p>

<pre class="sourceCode python"><code class="sourceCode python">append_result(abc)      <span class="co"># Error! abc isn&#39;t defined</span></code></pre>

<p>We need the value to be quoted like this:</p>

<pre class="sourceCode python"><code class="sourceCode python">append_result(<span class="st">&#39;abc&#39;</span>)</code></pre>

<p>The <code>repr</code> function supplies the quotes around the string for us, and also provides backslashes where needed:</p>

<pre class="sourceCode python"><code class="sourceCode python">append_result(<span class="st">&#39;&quot;Don</span><span class="ch">\&#39;</span><span class="st">t you like my hat?&quot; he asked.&#39;</span>)</code></pre>

<p>Notice that we first check if the token is an empty string with <code>if token:</code>, since there's no point adding an empty string to the output. Because our regex is splitting on tag syntax, adjacent tags will have an empty token between them. The check here is an easy way to avoid putting useless <code>append_result(&quot;&quot;)</code> statements into our compiled function.</p>
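<p>The effect of <code>repr</code> and of the empty-token check can be seen with a few sample tokens. The token list below is made up for the example, and <code>ast.literal_eval</code> is used only to confirm that each buffered piece is valid Python source for the original string.</p>

```python
import ast

tokens = ["abc", "", '"Don\'t you like my hat?" he asked.', "line one\nline two"]

buffered = []
for token in tokens:
    if token:                       # skip empty tokens entirely
        buffered.append(repr(token))

for piece in buffered:
    print("append_result(%s)" % piece)

# Reading each piece back as a Python literal recovers the original text.
round_trip = [ast.literal_eval(piece) for piece in buffered]
print(round_trip == [t for t in tokens if t])
```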

<p>That completes the loop over all the tokens in the template. When the loop is done, all of the template has been processed. We have one last check to make: if <code>ops_stack</code> isn't empty, then we must be missing an end tag. Then we flush the buffered output to the function source:</p>

<!-- [[[cog include("templite.py", first="if ops_stack:", numblanks=2, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        <span class="kw">if</span> ops_stack:
            <span class="ot">self</span>._syntax_error(<span class="st">&quot;Unmatched action tag&quot;</span>, ops_stack[-<span class="dv">1</span>])

        flush_output()</code></pre>

<!-- [[[end]]] -->

<p>We had created a section at the beginning of the function. Its role was to unpack template variables from the context into Python locals. Now that we've processed the entire template, we know the names of all the variables, so we can write the lines in this prologue.</p>

<p>We have to do a little work to know what names we need to define. Looking at our sample template:</p>

<pre class="sourceCode html"><code class="sourceCode html"><span class="kw">&lt;p&gt;</span>Welcome, {{user_name}}!<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;p&gt;</span>Products:<span class="kw">&lt;/p&gt;</span>
<span class="kw">&lt;ul&gt;</span>
{% for product in product_list %}
    <span class="kw">&lt;li&gt;</span>{{ product.name }}:
        {{ product.price|format_price }}<span class="kw">&lt;/li&gt;</span>
{% endfor %}
<span class="kw">&lt;/ul&gt;</span></code></pre>

<p>Consider two of the variables used here, <code>user_name</code> and <code>product</code>. The <code>all_vars</code> set will have both of those names, because both are used in <code>{{...}}</code> expressions. (It will also have <code>product_list</code> and <code>format_price</code>, which come from the context just as <code>user_name</code> does.) But <code>product</code> does not need to be extracted from the context in the prologue, because it is defined by the loop.</p>

<p>All the variables used in the template are in the set <code>all_vars</code>, and all the variables defined in the template are in <code>loop_vars</code>. All of the names in <code>loop_vars</code> have already been defined in the code because they are used in loops. So we need to unpack any name in <code>all_vars</code> that isn't in <code>loop_vars</code>:</p>

<!-- [[[cog include("templite.py", first="for var_name", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        <span class="kw">for</span> var_name in <span class="ot">self</span>.all_vars - <span class="ot">self</span>.loop_vars:
            vars_code.add_line(<span class="st">&quot;c_</span><span class="ot">%s</span><span class="st"> = context[</span><span class="ot">%r</span><span class="st">]&quot;</span> % (var_name, var_name))</code></pre>

<!-- [[[end]]] -->

<p>Each name becomes a line in the function's prologue, unpacking the context variable into a suitably named local variable.</p>
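<p>For the sample template, the prologue could be worked out by hand as follows. The contents of the two sets are an assumption based on reading the compilation code above, and a plain list stands in for the <code>vars_code</code> section.</p>

```python
# Assumed contents of the two sets after compiling the sample template.
all_vars = {"user_name", "product_list", "product", "format_price"}
loop_vars = {"product"}

prologue = []
# sorted() is only here to make the output order predictable; the
# chapter's code iterates over the set difference directly.
for var_name in sorted(all_vars - loop_vars):
    prologue.append("c_%s = context[%r]" % (var_name, var_name))

print("\n".join(prologue))
```

<p>Since <code>product</code> is in both sets, it is left out of the prologue.</p>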

<p>We're almost done compiling the template into a Python function. Our function has been appending strings to <code>result</code>, so the last line of the function is simply to join them all together and return them:</p>

<!-- [[[cog include("templite.py", first='code.add_line("return', numlines=2, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        code.add_line(<span class="st">&quot;return &#39;&#39;.join(result)&quot;</span>)
        code.dedent()</code></pre>

<!-- [[[end]]] -->

<p>Now that we've finished writing the source for our compiled Python function, we need to get the function itself from our CodeBuilder object. The <code>get_globals</code> method executes the Python code we've been assembling. Remember that our code is a function definition (starting with <code>def render_function(..):</code>), so executing the code will define <code>render_function</code>, but not execute the body of <code>render_function</code>.</p>

<p>The result of <code>get_globals</code> is the dictionary of values defined in the code. We grab the <code>render_function</code> value from it, and save it as an attribute in our Templite object:</p>

<!-- [[[cog include("templite.py", first="self._render_function =", numlines=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        <span class="ot">self</span>._render_function = code.get_globals()[<span class="st">&#39;render_function&#39;</span>]</code></pre>

<!-- [[[end]]] -->

<p>Now <code>self._render_function</code> is a callable Python function. We'll use it later, during the rendering phase.</p>
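<p>Here is a sketch of that last step in isolation. The source string is a hand-written approximation of what the compiler might generate for the template <code>Hello, {{name}}!</code>; it is not output captured from the real engine.</p>

```python
# Hand-written stand-in for generated source; not captured from the engine.
python_source = """\
def render_function(context, do_dots):
    c_name = context['name']
    result = []
    append_result = result.append
    extend_result = result.extend
    to_str = str
    extend_result(['Hello, ', to_str(c_name), '!'])
    return ''.join(result)
"""

global_namespace = {}
exec(python_source, global_namespace)   # defines render_function, runs nothing else
render_function = global_namespace["render_function"]

# do_dots is unused by this template, so None is passed as a placeholder.
output = render_function({"name": "Ned"}, None)
print(output)
```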

<h4 id="compiling-expressions">Compiling Expressions</h4>

<p>We haven't yet seen a significant piece of the compiling process: the <code>_expr_code</code> method that compiles a template expression into a Python expression. Our template expressions can be as simple as a single name:</p>

<pre><code>{{user_name}}</code></pre>

<p>or can be a complex sequence of attribute accesses and filters:</p>

<pre><code>{{user.name.localized|upper|escape}}</code></pre>

<p>Our <code>_expr_code</code> method will handle all of these possibilities. As with expressions in any language, ours are built recursively: big expressions are composed of smaller expressions. A full expression is pipe-separated, where the first piece is dot-separated, and so on. So our function naturally takes a recursive form:</p>

<!-- [[[cog include("templite.py", first="def _expr_code", numlines=2, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> _expr_code(<span class="ot">self</span>, expr):
        <span class="co">&quot;&quot;&quot;Generate a Python expression for `expr`.&quot;&quot;&quot;</span></code></pre>

<!-- [[[end]]] -->

<p>The first case to consider is that our expression has pipes in it. If it does, then we split it into a list of pipe-pieces. The first pipe-piece is passed recursively to <code>_expr_code</code> to convert it into a Python expression.</p>

<!-- [[[cog include("templite.py", first="if ", after="def _expr_code", numlines=6, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        <span class="kw">if</span> <span class="st">&quot;|&quot;</span> in expr:
            pipes = expr.split(<span class="st">&quot;|&quot;</span>)
            code = <span class="ot">self</span>._expr_code(pipes[<span class="dv">0</span>])
            <span class="kw">for</span> func in pipes[<span class="dv">1</span>:]:
                <span class="ot">self</span>._variable(func, <span class="ot">self</span>.all_vars)
                code = <span class="st">&quot;c_</span><span class="ot">%s</span><span class="st">(</span><span class="ot">%s</span><span class="st">)&quot;</span> % (func, code)</code></pre>

<!-- [[[end]]] -->

<p>Each of the remaining pipe pieces is the name of a function. The value is passed through the function to produce the final value. Each function name is a variable that gets added to <code>all_vars</code> so that we can extract it properly in the prologue.</p>

<p>If there were no pipes, there might be dots. If so, split on the dots. The first part is passed recursively to <code>_expr_code</code> to turn it into a Python expression, then each dot name is handled in turn:</p>

<!-- [[[cog include("templite.py", first="elif ", after="def _expr_code", numlines=5, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        <span class="kw">elif</span> <span class="st">&quot;.&quot;</span> in expr:
            dots = expr.split(<span class="st">&quot;.&quot;</span>)
            code = <span class="ot">self</span>._expr_code(dots[<span class="dv">0</span>])
            args = <span class="st">&quot;, &quot;</span>.join(<span class="dt">repr</span>(d) <span class="kw">for</span> d in dots[<span class="dv">1</span>:])
            code = <span class="st">&quot;do_dots(</span><span class="ot">%s</span><span class="st">, </span><span class="ot">%s</span><span class="st">)&quot;</span> % (code, args)</code></pre>

<!-- [[[end]]] -->

<p>To understand how dots get compiled, remember that <code>x.y</code> in the template could mean either <code>x['y']</code> or <code>x.y</code> in Python, depending on which works; if the result is callable, it's called. This uncertainty means that we have to try those possibilities at run time, not compile time. So we compile <code>x.y.z</code> into a function call, <code>do_dots(x, 'y', 'z')</code> (in the generated code, the first argument is the prefixed local name, <code>c_x</code>). The <code>do_dots</code> function will try the various access methods and return the value that succeeded.</p>

<p>The <code>do_dots</code> function is passed into our compiled Python function at run time. We'll see its implementation in just a bit.</p>

<p>The last clause in the <code>_expr_code</code> function handles the case where there is no pipe or dot in the input expression. In that case, it's just a name. We record it in <code>all_vars</code>, and access the variable using its prefixed Python name:</p>

<!-- [[[cog include("templite.py", first="else:", after="def _expr_code", numlines=4, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">        <span class="kw">else</span>:
            <span class="ot">self</span>._variable(expr, <span class="ot">self</span>.all_vars)
            code = <span class="st">&quot;c_</span><span class="ot">%s</span><span class="st">&quot;</span> % expr
        <span class="kw">return</span> code</code></pre>

<!-- [[[end]]] -->
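
<p>To see the three cases working together, here is a minimal standalone sketch of the same logic. It is not the engine's code: <code>expr_code</code> is a hypothetical free function standing in for the <code>_expr_code</code> method, and it skips name validation and simply records names in a set that the caller passes in.</p>

```python
def expr_code(expr, all_vars):
    """Return a Python expression string for the template expression `expr`.

    Hypothetical standalone stand-in for the `_expr_code` method.
    `all_vars` is a set that collects every context name the expression uses.
    Name validation is deliberately left out of this sketch.
    """
    if "|" in expr:
        pipes = expr.split("|")
        # The first pipe-piece is the value; it may itself contain dots.
        code = expr_code(pipes[0], all_vars)
        for func in pipes[1:]:
            all_vars.add(func)
            # Wrap the expression so far in a call to the filter.
            code = "c_%s(%s)" % (func, code)
    elif "." in expr:
        dots = expr.split(".")
        code = expr_code(dots[0], all_vars)
        # Dot names are passed as string literals, to be resolved at run time.
        args = ", ".join(repr(d) for d in dots[1:])
        code = "do_dots(%s, %s)" % (code, args)
    else:
        all_vars.add(expr)
        code = "c_%s" % expr
    return code


names = set()
print(expr_code("user", names))             # c_user
print(expr_code("user.name", names))        # do_dots(c_user, 'name')
print(expr_code("user.name|upper", names))  # c_upper(do_dots(c_user, 'name'))
print(sorted(names))                        # ['upper', 'user']
```

<p>Notice that only <code>user</code> and <code>upper</code> end up in the set. The dot name <code>name</code> is never looked up in the context, so it doesn't need to be extracted in the prologue.</p>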

<h4 id="helper-functions">Helper Functions</h4>

<p>During compilation, we used a few helper functions. The <code>_syntax_error</code> method simply puts together a nice error message and raises the exception:</p>

<!-- [[[cog include("templite.py", first="def _syntax_error", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> _syntax_error(<span class="ot">self</span>, msg, thing):
        <span class="co">&quot;&quot;&quot;Raise a syntax error using `msg`, and showing `thing`.&quot;&quot;&quot;</span>
        <span class="kw">raise</span> TempliteSyntaxError(<span class="st">&quot;</span><span class="ot">%s</span><span class="st">: </span><span class="ot">%r</span><span class="st">&quot;</span> % (msg, thing))</code></pre>

<!-- [[[end]]] -->

<p>The <code>_variable</code> method validates variable names and adds them to the sets of names we collect during compilation. We use a regex to check that the name is a valid Python identifier, then add the name to the set:</p>

<!-- [[[cog include("templite.py", first="def _variable", numblanks=4, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> _variable(<span class="ot">self</span>, name, vars_set):
        <span class="co">&quot;&quot;&quot;Track that `name` is used as a variable.</span>

<span class="co">        Adds the name to `vars_set`, a set of variable names.</span>

<span class="co">        Raises a syntax error if `name` is not a valid name.</span>

<span class="co">        &quot;&quot;&quot;</span>
        <span class="kw">if</span> not re.match(<span class="st">r&quot;[_a-zA-Z][_a-zA-Z0-9]*$&quot;</span>, name):
            <span class="ot">self</span>._syntax_error(<span class="st">&quot;Not a valid name&quot;</span>, name)
        vars_set.add(name)</code></pre>

<!-- [[[end]]] -->
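
<p>To get a feel for what that regex accepts, here is a small sketch that runs the same pattern over a few candidate names. The helper <code>is_valid_name</code> and the sample names are made up for illustration; the engine itself raises a syntax error rather than returning a boolean.</p>

```python
import re

# The same pattern used by `_variable`: a letter or underscore,
# followed by any number of letters, digits, or underscores.
NAME_RE = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*$")


def is_valid_name(name):
    """Hypothetical helper: True if `name` passes the engine's name check."""
    return bool(NAME_RE.match(name))


for candidate in ["user", "_private", "item2", "2items", "user-name", "a b", ""]:
    print(repr(candidate), is_valid_name(candidate))
```

<p>The first three names pass. The rest fail because they start with a digit, contain a character that can't appear in an identifier, or are empty.</p>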

<p>With that, the compilation code is done!</p>

<h4 id="rendering">Rendering</h4>

<p>All that's left is to write the rendering code. Since we've compiled our template to a Python function, the rendering code doesn't have much to do. It has to get the data context ready, and then call the compiled Python code:</p>

<!-- [[[cog include("templite.py", first="def render(", numblanks=3, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> render(<span class="ot">self</span>, context=<span class="ot">None</span>):
        <span class="co">&quot;&quot;&quot;Render this template by applying it to `context`.</span>

<span class="co">        `context` is a dictionary of values to use in this rendering.</span>

<span class="co">        &quot;&quot;&quot;</span>
        <span class="co"># Make the complete context we&#39;ll use.</span>
        render_context = <span class="dt">dict</span>(<span class="ot">self</span>.context)
        <span class="kw">if</span> context:
            render_context.update(context)
        <span class="kw">return</span> <span class="ot">self</span>._render_function(render_context, <span class="ot">self</span>._do_dots)</code></pre>

<!-- [[[end]]] -->

<p>Remember that when we constructed the <code>Templite</code> object, we started with a data context. Here we copy it, and merge in whatever data has been passed in for this rendering. The copying is so that successive rendering calls won't see each other's data, and the merging is so that we have a single dictionary to use for data lookups. This is how we build one unified data context from the context provided when the template was constructed and the data provided now at render time.</p>

<p>Notice that the data passed to <code>render</code> could override data passed to the <code>Templite</code> constructor. That tends not to happen, because the context passed to the constructor has global-ish things like filter definitions and constants, and the context passed to <code>render</code> has specific data for that one rendering.</p>
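
<p>The copy-then-update behavior is easy to check in isolation. This sketch pulls the merging step out into a hypothetical function, <code>build_render_context</code>, with invented sample data; it isn't part of the engine, but it follows the same two steps as <code>render</code>.</p>

```python
def build_render_context(base_context, context=None):
    """Return a fresh dict: constructor-time values overlaid with render-time values.

    Hypothetical helper mirroring the first lines of `render`.
    """
    render_context = dict(base_context)   # copy, so base_context is never mutated
    if context:
        render_context.update(context)    # render-time values win on key clashes
    return render_context


# Invented sample data: a filter and a constant given at construction time.
base = {"upper": str.upper, "site": "Example"}

first = build_render_context(base, {"name": "Ned"})
second = build_render_context(base, {"name": "Ana", "site": "Override"})

print(first["site"], first["name"])     # Example Ned
print(second["site"], second["name"])   # Override Ana
print(sorted(base))                     # ['site', 'upper']
```

<p>The second rendering replaces <code>site</code> for itself only. The base context still has just its original two keys, so the next rendering starts clean.</p>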

<p>Then we simply call our compiled <code>render_function</code>. The first argument is the complete data context, and the second argument is the function that will implement the dot semantics. We use the same implementation every time: our own <code>_do_dots</code> method.</p>

<!-- [[[cog include("templite.py", first="def _do_dots", numblanks=1, dedent=False) ]]] -->

<pre class="sourceCode python"><code class="sourceCode python">    <span class="kw">def</span> _do_dots(<span class="ot">self</span>, value, *dots):
        <span class="co">&quot;&quot;&quot;Evaluate dotted expressions at runtime.&quot;&quot;&quot;</span>
        <span class="kw">for</span> dot in dots:
            <span class="kw">try</span>:
                value = <span class="dt">getattr</span>(value, dot)
            <span class="kw">except</span> <span class="ot">AttributeError</span>:
                value = value[dot]
            <span class="kw">if</span> <span class="dt">callable</span>(value):
                value = value()
        <span class="kw">return</span> value</code></pre>

<!-- [[[end]]] -->

<p>During compilation, a template expression like <code>x.y.z</code> gets turned into <code>do_dots(x, 'y', 'z')</code>. This function loops over the dot-names, and for each one tries it as an attribute, and if that fails, tries it as a key. This is what gives our single template syntax the flexibility to act as either <code>x.y</code> or <code>x['y']</code>. At each step, we also check if the new value is callable, and if it is, we call it. Once we're done with all the dot-names, the value in hand is the value we want.</p>

<p>Here we used Python argument unpacking again (<code>*dots</code>) so that <code>_do_dots</code> could take any number of dot names. This gives us a flexible function that will work for any dotted expression we encounter in the template.</p>
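
<p>Here is a small sketch that exercises the same logic outside the engine. The function body matches <code>_do_dots</code> with <code>self</code> removed; the <code>User</code> class and the sample data are invented for the example.</p>

```python
def do_dots(value, *dots):
    """Standalone copy of `_do_dots`: resolve each dot name at run time."""
    for dot in dots:
        try:
            value = getattr(value, dot)   # try attribute access first
        except AttributeError:
            value = value[dot]            # fall back to key lookup
        if callable(value):
            value = value()               # call methods with no arguments
    return value


class User:
    """Invented example class with one attribute and one method."""

    def __init__(self, name):
        self.name = name

    def shout(self):
        return self.name.upper() + "!"


data = {"user": User("ned"), "meta": {"tags": ["a", "b"]}}

print(do_dots(data, "user", "name"))    # ned
print(do_dots(data, "user", "shout"))   # NED!
print(do_dots(data, "meta", "tags"))    # ['a', 'b']
```

<p>One consequence of trying attributes first: a dictionary key that shares its name with a dictionary method, such as <code>items</code> or <code>keys</code>, is shadowed by the method, which is then called.</p>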

<p>Note that when calling <code>self._render_function</code>, we pass in a function to use for evaluating dot expressions, but we always pass in the same one. We could have made that code part of the compiled template, but it's the same eight lines for every template, and those eight lines are part of the definition of how templates work, not part of the details of a particular template. It feels cleaner to implement it like this than to have that code be part of the compiled template.</p>

<h2 id="testing">Testing</h2>

<p>Provided with the template engine is a suite of tests that cover all of the behavior and edge cases. I'm actually a little bit over my 500-line limit: the template engine is 252 lines, and the tests are 275 lines. This is typical of well-tested code: you have more code in your tests than in your product.</p>

<h2 id="whats-left-out">What's Left Out</h2>

<p>Full-featured template engines provide much more than we've implemented here. To keep this code small, we're leaving out interesting ideas like:</p>

<ul>
<li>Template inheritance and inclusion</li>
<li>Custom tags</li>
<li>Automatic escaping</li>
<li>Arguments to filters</li>
<li>Complex conditional logic like <code>else</code> and <code>elif</code></li>
<li>Loops with more than one loop variable</li>
<li>Whitespace control</li>
</ul>

<p>Even so, our simple template engine is useful. In fact, it is the template engine used in coverage.py to produce its HTML reports.</p>

<h2 id="summing-up">Summing Up</h2>

<p>In 252 lines, we've got a simple yet capable template engine. Real template engines have many more features, but this code lays out the basic ideas of the process: compile the template to a Python function, then execute the function to produce the text result.</p>
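
<p>As a closing illustration of that two-step process, here is a hand-written approximation of what a compiled template might look like for the text <code>Hello, {{name|upper}}!</code>. The source string below is an assumption written for this example, not the engine's actual output, which differs in detail. It only shows the shape of the idea: build Python source as text, execute it to define a function, then call the function with a context.</p>

```python
# Hand-written stand-in for generated code (not the engine's real output).
source = '''
def render_function(context, do_dots):
    # Prologue: pull the names the template uses out of the context.
    c_name = context['name']
    c_upper = context['upper']
    result = []
    result.append('Hello, ')
    result.append(str(c_upper(c_name)))
    result.append('!')
    return ''.join(result)
'''

# Step 1: "compile" by executing the source to define the function.
namespace = {}
exec(source, namespace)
render_function = namespace["render_function"]

# Step 2: render by calling the function with a data context.
# This template has no dotted names, so do_dots is never used.
print(render_function({"name": "Ned", "upper": str.upper}, None))  # Hello, NED!
```

<p>The same compiled function can be called again with a different context, which is the payoff of doing the parsing work once up front.</p>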
  </body>
</html>
